LAMP-TR-070 

CS-TR-4248 

UMIACS-TR-2001-33 


May  2001 


Implicit  Cues  for  Explicit  Generation:  Using  Telicity  as  a 
Cue  for  Tense  Structure  in  a  Chinese  to  English  MT 

System 

Mari  Olsen,  David  Traum,  Carol  Van  Ess-Dykema,  Amy  Weinberg 


Language and Media Processing Laboratory
Institute for Advanced Computer Studies
College  Park,  MD  20742 


Abstract 

In translating from Chinese to English, tense and other temporal information
must be inferred from other grammatical and lexical cues. Tense information
is crucial to providing accurate and fluent translations into English.
Perfective and imperfective grammatical aspect markers can provide cues to
temporal structure, but such information is optional in Chinese and is not
present in the majority of sentences. We report on a project that assesses
the relative contribution of the lexical aspect features of (a)telicity
reflected in the Lexical Conceptual Structure of the input text, versus more
overt aspectual and adverbial markers of tense, to suggest tense structure
in the English translation of a Chinese newspaper corpus. Incorporating this
information allows a 20% to 35% boost in the accuracy of tense realization,
with a best accuracy rate of 92% on a corpus of Chinese articles.

1  Introduction 

This paper advances the state of the art in lexicon design for MT by using
an interlingua in which aspectual distinctions (telic versus atelic) fill
lexical gaps in the source language that cannot be left unspecified in the
target language. These distinctions can be derived from verb classifications
that were motivated primarily by considerations of argument structure. In
translating from Chinese to English, tense must be inferred from other
grammatical and lexical cues. For example, Chinese verbs do not necessarily
specify whether the event described is prior to or contemporaneous with the
moment of speaking. It is true that grammatical aspect information can be
loosely associated with time, with imperfective aspect (Chinese 在 zai- and
着 -zhe) representing present time and perfective (Chinese 了 le)
representing past time (Chu, 1998; Li and Thompson, 1981). However,
past-tense verbs do not need any aspect marking distinguishing them from
present-tense verbs. This is unlike English, which much more rigidly
distinguishes past from present tense through the use of suffixes. Thus, to
generate an appropriate English sentence from its Chinese counterpart, we
need to fill in a potentially unexpressed tense.

As  an  example,  the  final  verb  in  sentence  (1)  is 
unmarked  for  aspect,  but  must  be  realized  in  the 
past  tense. 

(1)  1965 year before , our.country altogether only have 30 ten_thousand ton
     de shipbuilding capacity , year output is 8 ten_thousand ton

Before  1965  China  had  a  total  of  only  300,000 
tons  of  shipbuilding  capacity  and  the  annual 
output  was  80,000  tons. 

In our NLP applications, we use Lexical Conceptual Structures (LCS)
(Jackendoff, 1983) as an interlingua, e.g., for machine translation. The
primitives of the interlingua can capture both conceptual and syntactic
generalizations among languages (Dorr et al., 1993).^ Though LCS primitives
deal with argument structures, Dorr and Olsen (1997a) have shown how to map
the predicate types in the LCS to aspectual structure. Different predicate
types, needed for argument structure mapping, can encode whether an event is
bounded in time (telic) or unbounded (atelic). We will rely on the lexical
information of the verbs within a sentence to generate appropriately tensed
English translations for Chinese.

2  Use  of  Aspect  to  Provide 
Temporal  Information 

We now discuss relevant aspectual features of sentences, and show how they
can provide information about the time of the situations presented in a
sentence. Aspectual features can be divided into grammatical aspect, which
is indicated by lexical or morphological markers in a sentence, and lexical
aspect, which is inherent in the meanings of words.

^LCS representations in our system have been created for Korean, Spanish and
Arabic, as well as for English and Chinese.

2.1  Grammatical  aspect 

Grammatical aspect provides a viewpoint on situation (event or state)
structure (Smith, 1997). Since imperfective aspect, such as the English
PROGRESSIVE construction be VERB-ing, views a situation from within, it is
often associated with present or contemporaneous time reference. On the
other hand, perfective aspect, such as the English have VERB-ed, views a
situation as a whole; it is therefore often associated with past time
reference (Comrie, 1976; Olsen, 1997; Smith, 1997; cf. Chu, 1998). The
temporal relations are tendencies: although the perfective is found more
frequently in past tenses (Comrie, 1976), both imperfective and perfective
co-occur in some languages with past, present, and future tense. Grammatical
aspect marking is optional in Chinese, where it can be expressed by a
post-verbal particle. When these particles are present, they provide helpful
information and disambiguate the tense interpretation, as shown in (2).

(2)  1991 year 3 month 27 day US member_of Congress Wolfe class, as
     guest visit aspect Beijing de one prison

On March 27, 1991, Congressman Wolfe etc. visited Beijing Number One Prison
as guests.

Tense and/or aspect marking is required in English for both matrix and
embedded clauses. For example, even a verb like want, which requires either
a present infinitive or a past-oriented complement (and subject drop), as in
(3), indicates whether the infinitive marks past or present time.

(3)  Wolfe  wanted  to  {publicize  /  have  publicized} 
the  baseless  criticism  on  various  occasions. 


Leaving out tense information, or getting it wrong during translation, thus
compromises both the fluency and the accuracy of the translation.


2.2  Adverbial  Information 

In addition to grammatical markers, certain adverbs place temporal
restrictions on the tense of their associated clauses. For example, 已 yi
and 已经 yijing (already) imply the past tense, while 将 jiang, 将来
jianglai (will), 会 hui, and 正在 zhengzai imply a present or future
interpretation. When available, we want to use these cues to provide better
translations.


2.3  Lexical  aspect 

While  grammatical  aspect  and  overt  temporal  cues 
are  clearly  helpful  in  translation,  there  are  many 
cases  in  our  corpus  in  which  such  cues  are  not 
present,  as  in  (4). 

(4)  especially is from 1992 year near_to 1995 year , foreign-capital influx
     take-on vertical soaring tendency , year average increase rate is 19

Especially  from  1992  to  1995,  the  foreign  capital 
influx  rose  sharply.  The  average  annual  increase 
was  19  percent. 

These are the hard cases, where we must infer tense or grammatical aspectual
marking in the target language from a source that looks like it provides no
overt cues. We will show, however, that Chinese does provide implicit cues
through its lexical aspect classes.

Lexical aspect refers to the type of situation denoted by the verb, alone or
combined with other sentential constituents. The standard aspectual classes
are based on three aspectual features: telicity, dynamicity and durativity.
We focus on telicity, also known as BOUNDEDNESS. Verbs that are telic have
an inherent end: winning, for example, ends with the finish line. Verbs that
are atelic do not name their end: running could end with a distance (run a
mile) or an endpoint (run to the store), for example. Olsen (1997) proposed
that aspectual interpretation be derived through monotonic composition of
features, as shown in Table 1. We focus on the telicity feature; the others
do not concern us here.

According to many researchers, knowledge of lexical aspect (how verbs denote
situations as developing or holding in time) correlates with the usual tense
realization of verbs (Dowty, 1986; Moens and Steedman, 1988; Passonneau,
1988). In particular, Dowty suggests that, absent other cues, a telic event
is interpreted as completed. Smith similarly suggests that in English all
past events are interpreted as telic (Smith, 1997; but cf. Olsen, 1997).

We note that these tendencies are heuristic, and not absolute. Nonetheless
we will show first how we can read this information from the LCS, a
representation not originally designed with this goal in mind, and then how
this information can be used to produce better Chinese to English
translation.

Aspectual Class   Telic   Dynamic   Durative   Examples
State                               +          know, have
Activity                  +         +          run, paint
Accomplishment    +       +         +          destroy
Achievement       +       +                    notice, win

Table 1: Lexical Aspect Features
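The monotonic composition behind Table 1 can be illustrated with a minimal sketch. It assumes the features are privative (a class either carries a feature or leaves it unspecified) and that composition can only add features; the names ASPECT_CLASSES, classify and compose are illustrative and are not part of the system described here.

```python
# Feature sets per aspectual class, read off Table 1 as privative features.
# All names in this sketch are hypothetical illustrations.
ASPECT_CLASSES = {
    "State": frozenset({"durative"}),
    "Activity": frozenset({"dynamic", "durative"}),
    "Accomplishment": frozenset({"telic", "dynamic", "durative"}),
    "Achievement": frozenset({"telic", "dynamic"}),
}


def classify(features):
    """Return the class whose feature set equals `features`, or None."""
    wanted = frozenset(features)
    for name, class_features in ASPECT_CLASSES.items():
        if class_features == wanted:
            return name
    return None


def compose(verb_features, added_features):
    """Monotonic composition: features may be added but never removed."""
    return frozenset(verb_features) | frozenset(added_features)


run = ASPECT_CLASSES["Activity"]
print(classify(run))                      # class of bare "run"
print(classify(compose(run, {"telic"})))  # "run to the store" adds telicity
```

Under these assumptions, adding a telic endpoint to an activity such as run yields the feature set of an accomplishment, while nothing in the composition can turn a telic verb back into an atelic one.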

3  Aspect  in  Lexical  Conceptual 
Structure 

Our implementation of Lexical Conceptual Structure (LCS), an augmented form
of (Jackendoff, 1983; Jackendoff, 1990), permits lexical aspect information
to be computed from lexical entries for individual verbs as well as from
composed representations for sentences, using uniform processes and
representations. The LCS framework classifies verbs using primitives (GO,
BE, STAY, etc.), types (Event, State, Path, etc.) and fields (Loc(ational),
Temp(oral), Poss(essional), Ident(ificational), Perc(eptual), etc.). Our
current working lexicon includes about 10,000 English verbs and 18,000
Chinese verbs. These verbs can be classified according to the primitives to
derive aspectually related classes. Some examples of templates representing
classes are shown in (5), each with an example of a verb in that class.

(5)

depart  (go loc (* thing 2)
            (away_from loc (thing 2)
                (at loc (thing 2)
                    (* thing 4)))
            (!!+ingly 26))

insert  (cause (* thing 1)
            (go loc (* thing 2)
                ((* toward 6) loc (thing 2)
                    ([at] loc (thing 2)
                        (thing 6))))
            (!!+ingly 26))
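Templates such as those in (5) are parenthesized expressions, so they can be read into nested lists with a few lines of code. The following is only a sketch of one way to do that; the helper names tokenize and parse are hypothetical, and the actual lexicon format may carry conventions this reader ignores.

```python
def tokenize(text):
    """Split a parenthesized template into parentheses and symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from a token list, consuming it in place.

    Assumes balanced parentheses; unbalanced input raises IndexError.
    """
    token = tokens.pop(0)
    if token != "(":
        return token
    node = []
    while tokens[0] != ")":
        node.append(parse(tokens))
    tokens.pop(0)  # discard the closing parenthesis
    return node


DEPART = """
(go loc (* thing 2)
    (away_from loc (thing 2)
        (at loc (thing 2)
            (* thing 4)))
    (!!+ingly 26))
"""

tree = parse(tokenize(DEPART))
print(tree[0])     # top-level primitive of the template
print(tree[3][0])  # head of the path node
```

With the template for depart, the top-level primitive is go and the path node is headed by away_from, which is the information the telicity test needs.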

Telic verbs (and sentences) can be classed as either inherently telic or
derived telic. Some verbs have an inherent endpoint, while others combine
with other phrases to specify an end. Telic verbs constructed with paths
will also have a potential counterpart consisting of an atelic verb plus
prepositions or other lexical items that add the requisite path. Depart, for
example, corresponds to move away, or something similar in another language.

We therefore identify telic sentences by the algorithm formally specified in
Figure 1 (simplified from Dorr and Olsen (1997b) [156]).


Given an LCS representation L:

1. Initialize: T(L) := [∅T]

2. If Top node of L ∈ {CAUSE, LET, GO}
   Then T(L) := [+T]

3. If Top node of L ∈ {ACT, BE, STAY}
   Then If Internal node of L ∈ {TO, TOWARD, FOR_Temp}
        Then T(L) := [+T]

4. Return T(L)

Figure 1: Algorithm for LCS Telicity Determination

First the top node is examined for primitives that indicate telicity: if the
top node is CAUSE, LET, or GO, telicity is set to [+T], as with the verbs
break and destroy, for example. If the top node is not a telic indicator
(i.e., the verb is a basically atelic predicate such as love or run),
telicity may still be indicated by the presence of complement nodes, e.g.,
a goal phrase (TO primitive) in the case of run.
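The procedure in Figure 1 can be sketched as follows. The sketch assumes a simplified, hypothetical encoding in which an LCS node is a tuple whose first element is the lower-cased primitive and whose remaining elements are child nodes; the example structures for run and break are illustrative and are not taken from the lexicon.

```python
TELIC_TOP = {"cause", "let", "go"}
ATELIC_TOP = {"act", "be", "stay"}
TELIC_INTERNAL = {"to", "toward", "for_temp"}


def primitives_below(node):
    """Yield the primitive of every node strictly below `node`."""
    for child in node[1:]:
        if isinstance(child, tuple):
            yield child[0]
            yield from primitives_below(child)


def telicity(lcs):
    """Return "+T" for telic structures and "0T" (unspecified) otherwise."""
    top = lcs[0]
    if top in TELIC_TOP:
        return "+T"
    if top in ATELIC_TOP and any(
        primitive in TELIC_INTERNAL for primitive in primitives_below(lcs)
    ):
        return "+T"
    return "0T"


# Hypothetical encodings, for illustration only.
RUN = ("act", ("thing",))
RUN_TO_STORE = ("act", ("thing",), ("to", ("at", ("thing",))))
BREAK = ("cause", ("thing",), ("go", ("thing",), ("toward", ("thing",))))

for name, lcs in [("run", RUN), ("run to the store", RUN_TO_STORE), ("break", BREAK)]:
    print(name, telicity(lcs))
```

The value "0T" stands for the unspecified initial value of step 1, so a structure that matches neither test is left unmarked rather than being labeled atelic.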

4  Predictions

Based on (Dowty, 1986) and others, as discussed above, we predict that
Chinese sentences that lack grammatical aspect markers but have a telic LCS
will translate better into English in the past tense, and those that lack
telic identifiers will translate better in the present tense. Where present,
grammatical aspect marking or adverbial marking can supersede the
information provided by lexical aspect, with past adverbials and perfective
markers yielding a past interpretation, and imperfective markers and
future-oriented adverbials yielding present or future tense translations.

5  Implementation: a Chinese to English Machine Translation System

LCSes are used as the interlingua for our machine translation efforts. We
have built a Chinese to English MT system focused on translating newswire.
Using the algorithm described below, the system aspectually types the
relevant verbs using grammatical aspect, lexical aspect, or adverbial
markers. The LCS thus provides the bridge from which the target-language
sentence is generated.


Following the principles in (Dorr, 1993), lexical information and
constraints on well-formed LCSes are used to compose an LCS for a complete
sentence from a sentence parse in a source language. This composed LCS
(CLCS) is then used as the starting point for generation into the target
language, using lexical information and constraints for the target language.

The generation component consists of the following subcomponents:

Decomposition and lexical selection  First, primitive LCSes for words in
    the target language are matched against CLCSes, and tree structures of
    covering words are selected. Ambiguity in the input and analysis
    represented in the CLCS is maintained (insofar as it is possible to
    realize particular readings using the target language lexicon), and new
    ambiguities are introduced when there are different ways of realizing a
    CLCS in the target language.

AMR Construction  This tree structure is then translated into a
    representation using the Augmented Meaning Representation (AMR) syntax
    of instances and hierarchical relations (Langkilde and Knight, 1998a);
    however, the relations include information present in the CLCS and
    LCSes for target language words, including theta roles, LCS type, and
    associated features.

Realization  The AMR structure is then linearized, as described in (Dorr et
    al., 1998), and morphological realization is performed. The result is a
    lattice of possible realizations, representing both the preserved
    ambiguity from previous processing phases and multiple ways of
    linearizing the sentence.

Extraction  The final stage uses a statistical extractor, using
    corpus-based bigram probabilities to pick an approximation of the most
    fluent realization (Langkilde and Knight, 1998b).
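The idea behind the extraction stage can be illustrated with a toy example. This is not the extractor of Langkilde and Knight; it is a minimal sketch that assumes the lattice has already been expanded into a short list of candidate strings, uses a three-sentence toy corpus, and applies add-one smoothing. All names and data are hypothetical.

```python
import math
from collections import Counter


def train_bigrams(sentences):
    """Count bigram contexts and bigrams, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams


def log_prob(sentence, contexts, bigrams, vocab_size):
    """Add-one smoothed bigram log-probability of a candidate sentence."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def extract(candidates, contexts, bigrams, vocab_size):
    """Pick the candidate realization with the highest bigram score."""
    return max(candidates, key=lambda s: log_prob(s, contexts, bigrams, vocab_size))


TOY_CORPUS = ["the output was high", "the output rose sharply", "the rate was low"]
contexts, bigrams = train_bigrams(TOY_CORPUS)
vocab = {word for sentence in TOY_CORPUS for word in sentence.split()} | {"</s>"}

best = extract(
    ["the output was high", "output the high was"], contexts, bigrams, len(vocab)
)
print(best)
```

A scorer of this kind only ranks word sequences. It has no access to the temporal meaning of the source sentence, which is consistent with the observation below that leaving tense to the extractor gave poor results.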

In order to realize sentences in English, we must have a tense feature, as
discussed above. As a worst case, sentences with no tense feature could be
given a random tense by the statistical extractor, or a default tense. Both
of these options were tried, yielding poor results. For this reason, the
realization algorithm has been augmented with the rules in (6) for creating
a tense feature using other available information. Items prefixed with :
indicate features present in the Verb AMR. :caspect refers to grammatical
aspect from the analysis of the input. In this case, the value PERF comes
from 了 (-le) being a direct subordinate of the verb. The :telic feature is
computed using the algorithm in Figure 1. Finally, the :headline feature is
assumed to be added by a pre-processing phase that identifies an input
sentence as a newspaper headline.

(6)  If :tense feature in the input
         then use input value for :tense
     else if :headline +
         then :tense = present
     else if :caspect PERF
         then :tense = past
     else if adverb 已 (yi) or 已经 (yijing)
         then :tense = past
     else if one of the future-oriented adverbs of Section 2.2
         then :tense = present
     else if :telic +
         then :tense = past
     else :tense = present
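The cascade in (6) can be written as a small function. The sketch below assumes the verb's features arrive as a plain dictionary and that adverbs are given in pinyin; the key names, the function name choose_tense, and the membership of the two adverb sets (taken from the adverbs listed in Section 2.2) are assumptions made for illustration.

```python
# Adverb sets assumed from Section 2.2; pinyin spellings are illustrative.
PAST_ADVERBS = {"yi", "yijing"}
NONPAST_ADVERBS = {"jiang", "jianglai", "hui", "zhengzai"}


def choose_tense(verb):
    """Apply the rules of (6) in order; `verb` is a hypothetical feature dict."""
    if verb.get("tense"):                # tense already present in the input
        return verb["tense"]
    if verb.get("headline"):             # headlines use the historical present
        return "present"
    if verb.get("caspect") == "PERF":    # perfective marker on the verb
        return "past"
    adverbs = set(verb.get("adverbs", ()))
    if adverbs & PAST_ADVERBS:
        return "past"
    if adverbs & NONPAST_ADVERBS:
        return "present"
    if verb.get("telic"):                # lexical aspect as the last cue
        return "past"
    return "present"


print(choose_tense({"telic": True}))
print(choose_tense({"telic": True, "headline": True}))
print(choose_tense({"adverbs": ["yijing"]}))
print(choose_tense({}))
```

The ordering matters: telicity is consulted only when no explicit tense, headline flag, aspect marker, or adverb has already decided the outcome, so a telic verb in a headline is still realized in the present tense.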

6  The  Corpus 

We have applied this machine translation system to a corpus of Chinese
newspaper text from Xinhua and other sources, primarily in the economics
domain. The genre is roughly comparable to the American Wall Street Journal.
The Chinese newspaper genre differs from other Chinese textual sources in a
number of ways, including:

•  more complex sentence structure

•  more extensive use of acronyms

•  less use of Classical Chinese

•  more representative grammar

•  more constrained vocabulary (limited lexicon)

•  extensive use of abbreviations in newspaper headlines

In order to test our hypothesis, we divided a 152-sentence newswire corpus
into a 99-verb training set and a 72-verb test set (some sentences had more
than one main verb). The sentence structure is complex and stylized, with an
average of 20 words per sentence in both the training and test corpora.

To evaluate the extent to which our predictions result in an improvement in
translation, we have used a database of human translations of the sentences
in our corpus as the ground truth, or gold standard. The translations were
constructed to provide fluent English for comprehension, and not for the
purposes of this experiment. In evaluating our results, we concentrate on
how well the system did at matching past and present tenses to those
provided by a human.

7  Results 

The training corpus was used to refine the tense algorithm in (6), yielding
a success rate of greater than 90%. We have subsequently applied this
algorithm to generate tense for the 72 additional clauses in the test set,
which had not been previously studied. Evaluation can be very difficult in a
number of cases. Concerning tense, our "gold standard" is the set of human
translations previously constructed for these sentences. In many cases,
there is nothing overt in the sentence which would specify tense, so a
mismatch might not actually be "wrong". Also, there are a number of
sentences which were not directly applicable for comparison, such as when
the human translator chose a different syntactic structure or a complex
tense.

The verbs in the human translations appeared in the simple present or past,
the present or past perfect (has or had verb+ed), the present or past
imperfective (is verb+ing, was verb+ing), and the corresponding passive
forms (is being kicked, was being kicked, have been kicked). For cases like
the present perfect (has kicked), we noted the intended meaning (e.g., past
activity) expressed by the verb as well as the verb's actual present
perfective form. We scored the form as correct if the system translated a
present perfective with past tense meaning as a simple past. The results of
our evaluation are summarized in the tables below. The first table uses
headline, grammatical aspect, adverbial, and lexical information. The
results using only lexical information are summarized in the following
table.


even when only past time is in question. For now, this must remain a
speculation. Results are also clearly better than always picking present
tense, or just using one of the features of grammatical or lexical aspect.
We also note that in 2 cases of headlines, telicity alone would have
predicted past tense, but the human translation used the present tense.
Headlines are written using the historical present in English ("Man bites
Dog").

8  Conclusions 

We therefore conclude that lexical and grammatical aspect can serve as a
valuable heuristic for suggesting tense, in the absence of tense and other
temporal markers. In addition, lexical aspect, as represented by the
interlingual LCS structure, can serve as the foundation for
language-specific heuristics. Thus, the interlingual representation may be
used to provide not only shared semantic and syntactic structure, but also
the building blocks for language-specific heuristics for mismatches between
languages. More importantly, it can be used to infer information not overtly
present in the string or syntactic structure of a language, leading to
fluent and accurate translation.


human           generated tense
translation     past      present
past            22        0
present         6         44

Table 2: Preliminary Tense Results using all Info


human           generated tense
translation     past      present
past            18        4
present         13        37

Table 3: Preliminary Tense Results using Lexical Only
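The accuracy figures implied by Tables 2 and 3 can be recomputed from the cell counts. The sketch below stores each table as a nested dictionary and counts the diagonal (matching) cells as correct; the variable and function names are illustrative.

```python
def accuracy(matrix):
    """Share of verbs whose two tense labels match (diagonal cells)."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


# Cell counts copied from Tables 2 and 3.
ALL_INFO = {
    "past": {"past": 22, "present": 0},
    "present": {"past": 6, "present": 44},
}
LEXICAL_ONLY = {
    "past": {"past": 18, "present": 4},
    "present": {"past": 13, "present": 37},
}

print(f"all information: {accuracy(ALL_INFO):.1%}")      # 66 of 72
print(f"lexical only:    {accuracy(LEXICAL_ONLY):.1%}")  # 55 of 72
```

Both tables sum to the 72 verbs of the test set. Using all information gives 66 matches (about 92%), and using lexical information alone gives 55 matches (about 76%).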


9  Future  Research 

There are a number of other directions we intend to pursue in extending this
work. First, we plan to try this on larger-scale corpora. We also plan to
extend our work to uncovering implicit discourse relations, capitalizing on
the insight that completed events usually indicate sequentiality while
uncompleted events are contemporaneous. We would also like to extend this
approach to other information contained in the LCS (e.g., causality), and to
investigate further whether we can predict when an overt marker is likely to
be used.